
    Generating Natural Language from Linked Data: Unsupervised template extraction

    We propose an architecture for generating natural language from Linked Data that automatically learns sentence templates and statistical document planning from parallel RDF datasets and text. We have built a proof-of-concept system (LOD-DEF) trained on un-annotated text from the Simple English Wikipedia and RDF triples from DBpedia, focusing exclusively on factual, non-temporal information. The goal of the system is to generate short descriptions, equivalent to Wikipedia stubs, of entities found in Linked Datasets. We have evaluated the LOD-DEF system against a simple generate-from-triples baseline and human-generated output. In evaluation by humans, LOD-DEF significantly outperforms the baseline on two of three measures: non-redundancy, and structure and coherence.
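    To make the idea of generating text from triples concrete, the following is a minimal sketch of template filling over RDF-style triples. It is not the LOD-DEF system: the templates here are hand-written rather than learned, and the names `TEMPLATES`, `label` and `describe`, as well as the sample triples, are assumptions introduced purely for illustration.

```python
# Hypothetical hand-written templates keyed by predicate name.
# "{s}" is the subject label, "{o}" the object label, "{p}" the predicate name.
TEMPLATES = {
    "birthPlace": "{s} was born in {o}.",
    "occupation": "{s} worked as a {o}.",
}

# Fallback used when no template exists for a predicate.
DEFAULT_TEMPLATE = "{s} has {p} {o}."


def label(term):
    """Turn a URI-like term into a readable label.

    Keeps the last path segment and replaces underscores with spaces.
    """
    return term.rstrip("/").rsplit("/", 1)[-1].replace("_", " ")


def describe(triples, templates=TEMPLATES):
    """Render each (subject, predicate, object) triple as one sentence."""
    sentences = []
    for s, p, o in triples:
        pred = label(p)
        template = templates.get(pred, DEFAULT_TEMPLATE)
        sentences.append(template.format(s=label(s), p=pred, o=label(o)))
    return " ".join(sentences)


if __name__ == "__main__":
    # Illustrative triples written in the style of DBpedia URIs.
    example = [
        ("http://dbpedia.org/resource/Ada_Lovelace",
         "http://dbpedia.org/ontology/birthPlace",
         "http://dbpedia.org/resource/London"),
    ]
    print(describe(example))
```

    A system of the kind described in the abstract would replace the hand-written `TEMPLATES` table with templates extracted from text, and would order and select triples rather than rendering all of them in input order.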

    A Semantic Web of Know-How: Linked Data for Community-Centric Tasks

    This paper proposes a novel framework for representing community know-how on the Semantic Web. Procedural knowledge generated by web communities typically takes the form of natural language instructions or videos and is largely unstructured. The absence of semantic structure impedes the deployment of many useful applications, in particular the ability to discover and integrate know-how automatically. We discuss the characteristics of community know-how and argue that existing knowledge representation frameworks fail to represent it adequately. We present a novel framework for representing the semantic structure of community know-how and demonstrate the feasibility of our approach by providing a concrete implementation which includes a method for automatically acquiring procedural knowledge for real-world tasks. Comment: 6th International Workshop on Web Intelligence & Communities (WIC14), Proceedings of the companion publication of the 23rd International Conference on World Wide Web (WWW 2014)

    Supporting text mining for e-Science: the challenges for Grid-enabled natural language processing

    Over the last few years, language technology has moved rapidly from 'applied research' to 'engineering', and from small-scale to large-scale engineering. Applications such as advanced text mining systems are feasible, but very resource-intensive, while research seeking to address the underlying language processing questions faces very real practical and methodological limitations. The e-Science vision, and the creation of the e-Science Grid, promises the level of integrated large-scale technological support required to sustain this important and successful new technology area. In this paper, we discuss the foundations for the deployment of text mining and other language technology on the Grid: the protocols and tools required to build distributed large-scale language technology systems, meeting the needs of users, application builders and researchers.

    Exploring data-in-use: the value of data for Local Government

    The power of data to support digital transformation within the context of e-Government is frequently underestimated. In this exploratory research, we develop a conceptual framework where the value of data stems from how it is used. We claim that the impact of digital transformation in the public sector presupposes an organisational culture that recognises and values data-in-use, by which is meant the practical application of data for a specific purpose, particularly by staff who deliver services. Through the lens of two ‘worldviews’ of data sharing, we present case studies of data use in two local authorities in Scotland. We claim that developing a culture where data is leveraged to derive insights for organisational activity requires combining working practices and technical infrastructure that centre on co-creating value with data. The presence of data intermediaries can support effective data-in-use to establish a healthy internal data ecosystem. Our research illustrates that local authorities within Scotland are still at an early stage of developing this culture.

    Computational semantics in the Natural Language Toolkit

    NLTK, the Natural Language Toolkit, is an open source project whose goals include providing students with software and language resources that will help them to learn basic NLP. Until now, the program modules in NLTK have covered such topics as tagging, chunking, and parsing, but have not incorporated any aspect of semantic interpretation. This paper describes recent work on building a new semantics package for NLTK. This currently allows semantic representations to be built compositionally as a part of sentence parsing, and for the representations to be evaluated by a model checker. We present the main components of this work, and consider comparisons between the Python implementation and the Prolog approach developed by Blackburn and Bos (2005).
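    As a rough illustration of what it means for a semantic representation to be evaluated by a model checker, the sketch below evaluates small first-order formulas against a finite model. It is written in plain Python and is not NLTK's semantics package or its API: the tuple encoding of formulas, the `evaluate` function, and the toy model are all assumptions made for this example.

```python
# Formulas are nested tuples (a hypothetical encoding for this sketch), e.g.
#   ("pred", "dog", "fido")                          dog(fido)
#   ("exists", "x", ("pred", "bark", "x"))           exists x. bark(x)
#   ("and", f1, f2), ("or", f1, f2), ("not", f)


def evaluate(formula, domain, valuation, g=None):
    """Return True if `formula` holds in the model (domain, valuation).

    `valuation` maps constants to entities and predicate names to sets of
    entity tuples. `g` is a variable assignment (variable name -> entity).
    """
    g = g or {}
    op = formula[0]
    if op == "pred":
        _, name, *args = formula
        # A bound variable takes its value from g; anything else is a constant.
        entities = tuple(g[a] if a in g else valuation[a] for a in args)
        return entities in valuation[name]
    if op == "not":
        return not evaluate(formula[1], domain, valuation, g)
    if op == "and":
        return all(evaluate(f, domain, valuation, g) for f in formula[1:])
    if op == "or":
        return any(evaluate(f, domain, valuation, g) for f in formula[1:])
    if op == "exists":
        _, var, body = formula
        return any(evaluate(body, domain, valuation, {**g, var: d}) for d in domain)
    if op == "all":
        _, var, body = formula
        return all(evaluate(body, domain, valuation, {**g, var: d}) for d in domain)
    raise ValueError(f"unknown operator: {op!r}")


# A toy model: two dogs (d1, d2) and one cat (c1); only d1 barks.
DOMAIN = {"d1", "d2", "c1"}
VALUATION = {
    "fido": "d1",
    "dog": {("d1",), ("d2",)},
    "bark": {("d1",)},
    "chase": {("d1", "c1")},
}

if __name__ == "__main__":
    some_dog_barks = ("exists", "x", ("and", ("pred", "dog", "x"), ("pred", "bark", "x")))
    print(evaluate(some_dog_barks, DOMAIN, VALUATION))
```

    In a compositional setting, the formula passed to such a checker would be assembled from the meanings of the words during parsing rather than written by hand as it is here.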